On the Learning and Learnability of Quasimetrics
Our world is full of asymmetries. Gravity and wind can make reaching a place
easier than coming back. Social artifacts such as genealogy charts and citation
graphs are inherently directed. In reinforcement learning and control, optimal
goal-reaching strategies are rarely reversible (symmetrical). Distance
functions supported on these asymmetrical structures are called quasimetrics.
Despite their common appearance, little research has been done on the learning
of quasimetrics.
Our theoretical analysis reveals that a common class of learning algorithms,
including unconstrained multilayer perceptrons (MLPs), provably fails to learn
a quasimetric consistent with training data. In contrast, our proposed Poisson
Quasimetric Embedding (PQE) is the first quasimetric learning formulation that
both is learnable with gradient-based optimization and enjoys strong
performance guarantees. Experiments on random graphs, social graphs, and
offline Q-learning demonstrate its effectiveness over many common baselines.
Comment: Project page: https://ssnl.github.io/quasimetric/ Code: https://github.com/SsnL/poisson_quasimetric_embeddin
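As an informal illustration of the quasimetric axioms this abstract refers to (zero self-distance, non-negativity, and the triangle inequality, with symmetry not required), the sketch below checks them on a pairwise distance matrix. It is a generic check, not the paper's PQE implementation, and the example matrix is simply the shortest-path distances of a small directed cycle.

```python
import itertools
import numpy as np

def is_quasimetric(D: np.ndarray, atol: float = 1e-6) -> bool:
    """Check the quasimetric axioms on a distance matrix D, where D[i, j]
    is the predicted distance from point i to point j.
    Symmetry is deliberately NOT required."""
    n = D.shape[0]
    if not np.allclose(np.diag(D), 0.0, atol=atol):       # d(x, x) = 0
        return False
    if (D < -atol).any():                                  # non-negativity
        return False
    for i, j, k in itertools.product(range(n), repeat=3):  # triangle inequality
        if D[i, k] > D[i, j] + D[j, k] + atol:
            return False
    return True

# Shortest-path distances of the directed cycle 0 -> 1 -> 2 -> 0:
# a valid quasimetric that is clearly asymmetric (d(0,1)=1 but d(1,0)=2).
D = np.array([[0.0, 1.0, 2.0],
              [2.0, 0.0, 1.0],
              [1.0, 2.0, 0.0]])
print(is_quasimetric(D))  # True
```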
Improved Representation of Asymmetrical Distances with Interval Quasimetric Embeddings
Asymmetrical distance structures (quasimetrics) are ubiquitous in our lives
and are gaining more attention in machine learning applications. Imposing such
quasimetric structures in model representations has been shown to improve many
tasks, including reinforcement learning (RL) and causal relation learning. In
this work, we present four desirable properties in such quasimetric models, and
show how prior works fail at them. We propose Interval Quasimetric Embedding
(IQE), which is designed to satisfy all four criteria. On three quasimetric
learning experiments, IQEs show strong approximation and generalization
abilities, leading to better performance and improved efficiency over prior
methods.
Project Page: https://www.tongzhouwang.info/interval_quasimetric_embedding
Quasimetric Learning Code Package:
https://www.github.com/quasimetric-learning/torch-quasimetric
Comment: NeurIPS 2022 NeurReps Workshop Proceedings Trac
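For intuition on how asymmetry can be built into a learned model while keeping the quasimetric guarantees, the sketch below uses a simpler, well-known construction, d(x, y) = Σ_i max(u_i(x) − u_i(y), 0), over a learned embedding u. This is not IQE itself (which aggregates interval lengths per group, as described in the paper and the code package above); it is only an illustrative relative, with hypothetical layer sizes.

```python
import torch
import torch.nn as nn

class SimpleQuasimetricHead(nn.Module):
    """Minimal quasimetric head (NOT the IQE formulation):
    d(x, y) = sum_i max(u_i(x) - u_i(y), 0) over a learned embedding u.
    Any function of this form satisfies d(x, x) = 0, non-negativity, and
    the triangle inequality, while allowing d(x, y) != d(y, x)."""

    def __init__(self, input_dim: int, latent_dim: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        u_x, u_y = self.encoder(x), self.encoder(y)
        # One-sided differences: only coordinates where u(x) exceeds u(y)
        # contribute, which is what makes the distance asymmetric.
        return torch.relu(u_x - u_y).sum(dim=-1)
```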
Learning to Synthesize a 4D RGBD Light Field from a Single Image
We present a machine learning algorithm that takes as input a 2D RGB image
and synthesizes a 4D RGBD light field (color and depth of the scene in each ray
direction). For training, we introduce the largest public light field dataset,
consisting of over 3300 plenoptic camera light fields of scenes containing
flowers and plants. Our synthesis pipeline consists of a convolutional neural
network (CNN) that estimates scene geometry, a stage that renders a Lambertian
light field using that geometry, and a second CNN that predicts occluded rays
and non-Lambertian effects. Our algorithm builds on recent view synthesis
methods, but is unique in predicting RGBD for each light field ray and
improving unsupervised single image depth estimation by enforcing consistency
of ray depths that should intersect the same scene point. Please see our
supplementary video at https://youtu.be/yLCvWoQLnms
Comment: International Conference on Computer Vision (ICCV) 201
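As a rough illustration of the Lambertian rendering stage described above, the sketch below warps a single center view into one sub-aperture view of a light field using a per-pixel disparity map, shifting each ray in proportion to the angular offset. The function, its signature, and the sign convention are assumptions for illustration, not the paper's pipeline.

```python
import torch
import torch.nn.functional as F

def render_lambertian_view(center_img, disparity, u, v):
    """Illustrative sketch (not the paper's code): warp a center view into
    the sub-aperture view at angular offset (u, v), assuming Lambertian
    surfaces and a predicted per-pixel disparity map.

    center_img: (1, 3, H, W) RGB center view
    disparity:  (1, 1, H, W) predicted disparity for the target rays
    u, v:       scalar angular offsets of the target view
    """
    _, _, H, W = center_img.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij")
    # Shift each ray by its disparity, scaled by the angular offset
    # (sign convention is a modeling assumption here).
    xs = xs + 2.0 * u * disparity[0, 0] / (W - 1)
    ys = ys + 2.0 * v * disparity[0, 0] / (H - 1)
    grid = torch.stack([xs, ys], dim=-1).unsqueeze(0)  # (1, H, W, 2)
    return F.grid_sample(center_img, grid, align_corners=True)
```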
Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning
In goal-reaching reinforcement learning (RL), the optimal value function has
a particular geometry, called quasimetric structure. This paper introduces
Quasimetric Reinforcement Learning (QRL), a new RL method that utilizes
quasimetric models to learn optimal value functions. Distinct from prior
approaches, the QRL objective is specifically designed for quasimetrics, and
provides strong theoretical recovery guarantees. Empirically, we conduct
thorough analyses on a discretized MountainCar environment, identifying
properties of QRL and its advantages over alternatives. On offline and online
goal-reaching benchmarks, QRL also demonstrates improved sample efficiency and
performance, across both state-based and image-based observations.
Comment: Project Page: https://www.tongzhouwang.info/quasimetric_rl
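The following is a hedged, penalty-form paraphrase of the kind of objective QRL is built around: push apart the predicted distances between random state-goal pairs while constraining each observed one-step transition to cost at most one step. The exact published objective differs, and the `quasimetric` callable and hyperparameters here are placeholders.

```python
import torch
import torch.nn.functional as F

def qrl_style_loss(quasimetric, s, s_next, goals, step_cost=1.0, lam=10.0):
    """Illustrative penalty-form objective in the spirit of QRL (not the
    exact published loss).

    quasimetric(a, b) -> (batch,) predicted distance from a to b
    """
    # Maximize distances between random state/goal pairs ...
    spread = -quasimetric(s, goals).mean()
    # ... subject to d(s, s') <= step_cost on observed transitions,
    # enforced here with a simple squared hinge penalty.
    violation = F.relu(quasimetric(s, s_next) - step_cost)
    constraint = (violation ** 2).mean()
    return spread + lam * constraint
```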
Diverse Image Generation via Self-Conditioned GANs
We introduce a simple but effective unsupervised method for generating
realistic and diverse images. We train a class-conditional GAN model without
using manually annotated class labels. Instead, our model is conditional on
labels automatically derived from clustering in the discriminator's feature
space. Our clustering step automatically discovers diverse modes, and
explicitly requires the generator to cover them. Experiments on standard mode
collapse benchmarks show that our method outperforms several competing methods
when addressing mode collapse. Our method also performs well on large-scale
datasets such as ImageNet and Places365, improving both image diversity and
standard quality metrics, compared to previous methods.
Comment: CVPR 2020. Code: https://github.com/stevliu/self-conditioned-gan.
Webpage: http://selfcondgan.csail.mit.edu
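A minimal sketch of the self-conditioning step this abstract describes: cluster images in the discriminator's feature space and reuse the cluster indices as automatically derived class labels for a class-conditional GAN. `discriminator_features` and the cluster count are hypothetical stand-ins, not the released code.

```python
import torch
from sklearn.cluster import KMeans

@torch.no_grad()
def pseudo_labels_from_discriminator(discriminator_features, images, k=100):
    """Cluster images in discriminator feature space and return cluster
    indices to use as class labels for conditional GAN training.
    `discriminator_features` is a hypothetical callable returning one
    feature vector per image."""
    feats = discriminator_features(images).cpu().numpy()   # (N, feat_dim)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(feats)
    return torch.as_tensor(labels, device=images.device)
```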
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study
In the era of large models, the autoregressive nature of decoding often
makes latency a significant bottleneck. We propose a
non-autoregressive LM-fused ASR system that effectively leverages the
parallelization capabilities of accelerator hardware. Our approach combines the
Universal Speech Model (USM) and the PaLM 2 language model in per-segment
scoring mode, achieving an average relative WER improvement across all
languages of 10.8% on FLEURS and 3.6% on YouTube captioning. Furthermore, our
comprehensive ablation study analyzes key parameters such as LLM size, context
length, vocabulary size, and fusion methodology. For instance, we explore the
impact of LLM size ranging from 128M to 340B parameters on ASR performance.
This study provides valuable insights into the factors influencing the
effectiveness of practical large-scale LM-fused speech recognition systems.
Comment: ICASSP 202
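As a generic illustration of per-segment LM rescoring of the kind the abstract describes, the sketch below combines an ASR score with a weighted LM score for each n-best hypothesis of a segment and keeps the best one. `lm_score_fn` and the weight are hypothetical, and this is not the paper's USM/PaLM 2 implementation.

```python
def rescore_segment(hypotheses, asr_scores, lm_score_fn, lm_weight=0.3):
    """Generic n-best rescoring sketch (not the paper's implementation):
    combine the ASR log-probability of each hypothesis with a language-model
    score over the segment and return the highest-scoring hypothesis.
    Because every hypothesis can be scored independently, this step
    parallelizes well on accelerator hardware."""
    best_hyp, best_score = None, float("-inf")
    for hyp, asr_score in zip(hypotheses, asr_scores):
        score = asr_score + lm_weight * lm_score_fn(hyp)
        if score > best_score:
            best_hyp, best_score = hyp, score
    return best_hyp
```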